Amendment to the Claims 



We claim: 

1-68. (Cancelled) 



69. (New) A computerized method of identifying a molecular feature set likely to be 
responsible for a given activity, based on a set of input data that represents molecules and that 
defines respectively for each molecule a molecular structure and an activity characteristic, the 
method comprising: 

establishing for each molecule a respective description, by comparison of the molecule's 
molecular structure to a set of molecular substructure keys; 

grouping the molecules based on similarity of their respective descriptions and without 
consideration of their respective activity characteristics, so as to define groups of structurally 
similar molecules; 

selecting at least one of the groups of structurally similar molecules based on an extent to 
which the molecules in the selected group have the given activity; 

for each of the at least one selected group, identifying at least one molecular feature set 
common to all of the molecules in the selected group; and 

outputting data indicative of at least one identified molecular feature set. 
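
As a non-limiting illustration of the steps recited in claim 69, the following minimal sketch walks through describing, grouping, selecting, and feature identification on toy data. Everything named here is an assumption made for illustration: molecules are reduced to sets of substructure labels, `KEYS`, `describe`, `group_by_description`, `select_groups`, and `common_features` are hypothetical names, grouping by identical description stands in for similarity-based grouping, and a set intersection stands in for common-feature identification.

```python
from collections import defaultdict

# Hypothetical toy data: (name, set of substructure labels, is_active).
MOLECULES = [
    ("m1", {"phenyl", "amide", "chloro"}, True),
    ("m2", {"phenyl", "amide", "chloro", "methyl"}, True),
    ("m3", {"phenyl", "hydroxyl"}, False),
    ("m4", {"phenyl", "hydroxyl", "methyl"}, False),
]
KEYS = ["phenyl", "amide", "hydroxyl"]  # assumed substructure keys


def describe(features, keys):
    """Description: 1 where the molecule contains the key, else 0."""
    return tuple(int(k in features) for k in keys)


def group_by_description(molecules, keys):
    """Group molecules with identical descriptions; activity is never consulted."""
    groups = defaultdict(list)
    for name, features, active in molecules:
        groups[describe(features, keys)].append((name, features, active))
    return dict(groups)


def select_groups(groups, min_fraction_active):
    """Keep groups in which at least the given fraction of molecules is active."""
    return [
        g for g in groups.values()
        if sum(active for _, _, active in g) / len(g) >= min_fraction_active
    ]


def common_features(group):
    """Feature set shared by every molecule in the group (set intersection)."""
    return set.intersection(*(features for _, features, _ in group))


groups = group_by_description(MOLECULES, KEYS)
selected = select_groups(groups, 0.5)
feature_sets = [common_features(g) for g in selected]
print(feature_sets)
```

In this sketch the activity flags are read only in `select_groups`, after the groups have already been formed, which mirrors the requirement that grouping proceed without consideration of activity.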



70. (New) The computerized method of claim 69, wherein grouping the molecules 
based on similarity of their respective descriptions comprises: 

applying a clustering algorithm to cluster the molecules based on their descriptions. 
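
By way of non-limiting example, one simple clustering algorithm that could operate on bit-vector descriptions is leader clustering with a Tanimoto similarity threshold. This is a stand-in chosen for brevity and is not the clustering recited in any particular claim; the names `tanimoto` and `leader_cluster`, the threshold value, and the example vectors are all assumptions.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two equal-length 0/1 vectors."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 1.0


def leader_cluster(vectors, threshold):
    """Assign each vector to the first cluster whose leader is similar enough.

    Returns a list of clusters, each a list of indices into `vectors`;
    the first index in each cluster is that cluster's leader.
    """
    clusters = []
    for i, v in enumerate(vectors):
        for cluster in clusters:
            if tanimoto(vectors[cluster[0]], v) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no similar leader found: start a new cluster
    return clusters


# Hypothetical description vectors for four molecules.
vectors = [(1, 1, 0, 0), (1, 1, 1, 0), (0, 0, 1, 1), (0, 0, 0, 1)]
clusters = leader_cluster(vectors, 0.5)
print(clusters)
```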

71. (New) The computerized method of claim 70, wherein the clustering algorithm 
comprises Self-Organizing-Map (SOM) clustering. 

72. (New) The computerized method of claim 69, wherein selecting at least one of 
the groups based on an extent to which the molecules in the selected group have the given activity 
comprises: 

selecting a group because the group contains at least a predetermined number of molecules 
that have the given activity. 



73. (New) The computerized method of claim 69, wherein selecting at least one of 
the groups based on an extent to which the molecules in the selected group have the given activity 
comprises: 

selecting a group because at least a predetermined percent of the molecules in the group 
have the given activity. 
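
The two selection criteria of claims 72 and 73 can be contrasted in a short, non-limiting sketch. Each group is represented here simply as a list of activity flags; the function names, the threshold values, and the example groups are assumptions made for illustration.

```python
def select_by_count(groups, min_active):
    """Keep groups containing at least `min_active` active molecules."""
    return [g for g in groups if sum(g) >= min_active]


def select_by_percent(groups, min_percent):
    """Keep non-empty groups in which at least `min_percent` percent are active."""
    return [g for g in groups if g and 100.0 * sum(g) / len(g) >= min_percent]


# Hypothetical groups: True marks a molecule having the given activity.
groups = [
    [True, True, False, False, False, False],  # 2 active, about 33 percent
    [True, False],                             # 1 active, 50 percent
    [False, False, False],                     # 0 active
]
by_count = select_by_count(groups, 2)
by_percent = select_by_percent(groups, 50.0)
```

With these example values the count criterion keeps only the first group and the percent criterion keeps only the second, which shows that the two criteria can select different groups from the same data.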



74. (New) The computerized method of claim 69, wherein identifying at least one 
molecular feature set common to all of the molecules in the selected group comprises identifying a 
maximum common substructure of the molecules in the selected group. 

75. (New) The computerized method of claim 74, wherein identifying a maximum 
common substructure of the molecules in the selected group comprises applying subgraph 
isomorphism to compare the descriptions of the molecules in the selected group. 

76. (New) The computerized method of claim 74, wherein identifying a maximum 
common substructure of the molecules in the selected group comprises applying a genetic algorithm 
to compare the descriptions of the molecules in the selected group. 

77. (New) The computerized method of claim 69, wherein the molecular feature set 
common to all of the molecules in the selected group is a contiguous molecular structure. 

78. (New) The computerized method of claim 69, wherein the molecular feature set 
common to all of the molecules in the selected group is a non-contiguous combination of molecular 
features. 

79. (New) The computerized method of claim 69, wherein identifying at least one 
molecular feature set common to all of the molecules in the selected group comprises: 

identifying a plurality of molecular feature sets each common to all of the molecules in the 
selected group. 

80. (New) The computerized method of claim 79, wherein outputting data indicative 
of at least one identified molecular feature set comprises: 

outputting data indicative of the plurality of molecular feature sets. 

81. (New) A computer-readable medium containing a set of machine language 
instructions executable by a computer to carry out the method of claim 69. 

82. (New) A machine programmed with machine language instructions executable by 
a processor to carry out the method of claim 69. 



83. (New) A method of identifying a molecular feature set likely to be responsible for 
a given activity, the method comprising: 

receiving into a computer a set of input data that represents molecules and that defines, 
respectively for each molecule, a molecular structure and an activity characteristic; 

operating the computer to establish for each molecule a respective description vector, by 
comparison of the molecule's molecular structure to a set of molecular substructure keys; 

operating the computer to apply a clustering algorithm so as to sort the molecules into 
groups based on similarity of their respective description vectors and without consideration of their 
respective activity characteristics; 

operating the computer to select at least one of the groups based on an extent to which the 
molecules in the selected group have the given activity; 

operating the computer to identify, for each of the at least one selected group, a maximum 
common substructure of the molecules in the selected group; and 

outputting from the computer data indicative of at least one identified molecular feature set. 

84. (New) A computerized method of identifying a molecular feature set likely to be 
responsible for a given activity, based on a set of input data that represents molecules and that 
defines respectively for each molecule a molecular structure and an activity characteristic, the 
method comprising: 

(a) establishing for each molecule a respective description, by comparison of the 
molecule's molecular structure to a set of molecular substructure keys; 

(b) grouping the molecules based on similarity of their respective descriptions and 
without consideration of their respective activity characteristics, so as to define groups of 
structurally similar molecules; 

(c) selecting at least one of the groups of structurally similar molecules based on an 
extent to which the molecules in the selected group have the given activity; 

(d) for each of the at least one selected group, identifying at least one molecular feature 
set common to all of the molecules in the selected group; 

(e) adding at least one identified molecular feature set as a new substructure key in the 
set of molecular substructure keys, so as to produce a modified set of molecular substructure keys, 
and then repeating elements (a) through (d) using the modified set of molecular substructure keys as 
the set of molecular substructure keys; and 

(f) outputting data indicative of at least one identified molecular feature set. 
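
The iteration recited in element (e) of claim 84 can be sketched, in a non-limiting way, as a loop that feeds each identified feature set back into the key set before the next pass. All names (`one_pass`, `refine`), the toy molecules, the activity threshold, and the use of subset tests and set intersections in place of real substructure matching are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: (set of feature labels, is_active).
MOLECULES = [
    (frozenset({"phenyl", "amide", "chloro"}), True),
    (frozenset({"phenyl", "amide", "chloro", "methyl"}), True),
    (frozenset({"phenyl", "amide", "chloro", "hydroxyl"}), True),
    (frozenset({"amide", "hydroxyl"}), False),
]


def one_pass(molecules, keys, min_fraction_active):
    """Elements (a) through (d): describe, group, select, find common features."""
    groups = defaultdict(list)
    for features, active in molecules:
        # (a) a key matches when all of its labels are present in the molecule
        description = tuple(int(key <= features) for key in keys)
        # (b) group by identical description, ignoring activity
        groups[description].append((features, active))
    found = []
    for members in groups.values():
        fraction = sum(active for _, active in members) / len(members)
        if fraction >= min_fraction_active:  # (c)
            # (d) features shared by every member of the selected group
            found.append(frozenset.intersection(*(f for f, _ in members)))
    return found


def refine(molecules, keys, min_fraction_active, rounds):
    """Element (e): add identified feature sets as new keys and repeat."""
    keys = list(keys)
    found = []
    for _ in range(rounds):
        found = one_pass(molecules, keys, min_fraction_active)
        keys.extend(fs for fs in found if fs not in keys)
    return found


initial_keys = [frozenset({"amide"}), frozenset({"hydroxyl"})]
first = refine(MOLECULES, initial_keys, 1.0, rounds=1)
second = refine(MOLECULES, initial_keys, 1.0, rounds=2)
print(first)
print(second)
```

In this toy data the first pass leaves one active molecule in a mixed group with an inactive one. Once the feature set found in the first pass is added as a key, the second pass separates those two molecules and identifies an additional feature set.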

85. (New) The method of claim 84, wherein outputting data indicative of at least one 
identified molecular feature set comprises: 

determining which identified molecular feature set has the most atoms, and outputting data 
indicative of that molecular feature set. 



86. (New) The method of claim 84, wherein grouping the molecules based on 
similarity of their respective descriptions comprises Self-Organizing-Map (SOM) clustering the 
molecules based on their respective descriptions. 

87. (New) The method of claim 86, wherein SOM clustering the molecules results in 
a SOM grid reflecting clusters of molecules, and wherein outputting data indicative of at least one 
identified molecular feature set comprises: 

outputting a screen display that depicts contents of the SOM grid. 

88. (New) A computer-readable medium containing a set of machine language 
instructions executable by a computer to carry out the method of claim 84. 

89. (New) A machine programmed with program instructions executable by a 
processor to carry out the method of claim 84. 

90. (New) A processing system for modeling chemical structure-activity relationships 
through artificial intelligence analysis of an input data set representing molecules, each of the 
molecules having a set of features and an activity characteristic, the processing system comprising, 
in combination: 

means for establishing for each molecule a respective description, by comparison of the 
molecule's molecular structure to a set of molecular substructure keys; 

means for grouping the molecules based on similarity of their respective descriptions and 
without consideration of their respective activity characteristics, so as to define groups of 
structurally similar molecules; 

means for selecting at least one of the groups of structurally similar molecules based on an 
extent to which the molecules in the selected group have the given activity; 

means for identifying at least one molecular feature set common to all of the molecules 
in each of at least one selected group; and 

means for outputting data indicative of at least one identified molecular feature. 

MCDONNELL BOEHNEN 
HULBERT & BERGHOFF 
300 SOUTH WACKER DRIVE 
CHICAGO, ILLINOIS 60608 
TELEPHONE (312)913-0001 